3 research outputs found
Information Processing in a Cognitive Model of NLP
A model of the cognitive process of natural language processing has been developed using the
formalism of generalized nets. Following this stage-simulating model, the treatment of information inevitably
includes phases which require joint operations in two knowledge spaces: language and semantics. In order to
examine and formalize the relations between the language and the semantic levels of treatment, the language is
presented as an information system, conceived on the basis of human cognitive resources, semantic primitives,
semantic operators, and language rules and data. This approach is applied to modeling a specific grammatical
rule: the secondary predication in Russian. Grammatical rules of the language space are expressed as
operators in the semantic space. Examples from the linguistics domain are treated and several conclusions about
the semantics of the modeled rule are drawn. The results of applying the information system approach to the
language turn out to be consistent with the stages of treatment modeled with the generalized net
Lojbanic English, An Interlingua for Parallel Machine Translation
We investigated machine translation using the interlingua Lojban and our own extensions, i.e., Lojbanic English. Lojban avoids ambiguity because it has 1342 primitive predicates and no polysemy. Lojbanic English was tested on a wide variety of sentence types, yielding 503 Lojban/Lojbanic English sentence tests. We developed a translator from English to Lojbanic English, using our 35 generic sentence patterns. Ambiguity was avoided, but unforeseen patterns were yet to be considered. We also investigated other anomalies that Lojban would be (mostly) able to avoid: the grammatical usage errors described by Swan. We implemented 80 of the 130 most common errors. The test suite was the Brown corpus, consisting of 55889 sentences. Our system detected 35 true positives distributed among 15 of Swan's rules. A low true positive rate, 35/55889, had been expected. No false positives were detected. When writing in Lojban one adds new predicates incrementally; this is very time consuming. To address Lojban's insufficient vocabulary, we developed an interactive algorithm which recasts the definitions of WordNet synonym sets into existing Lojban primitive predicates. The output is in terms of our Lojbanic English. If a relevant subset, e.g., 1/10, of the unique synonym-set definitions (totaling 116718) is converted into Lojban predicates, then 1.945 man years would be required for this effort. To avoid unforeseen syntactic patterns or implicit semantics, our Lojbanic English is such that the user writes only in terms of (Lojban) structure words, named entities and Lojbanic English predicates. Off-line, English phrases of the input sentence are mapped into Lojbanic English, e.g., linking clauses or phrases. Sufficient generality is employed to allow for reuse of the English phrases. At runtime, given an arbitrary English sentence, any errors in the final Lojbanic English generated are detected and corrected. Swan usage errors are avoided. This requires user skill.
This specification of Lojbanic English gives rise to a bijective function: English words can be automatically replaced with their foreign-language counterparts, in parallel. We are able to translate Lojbanic English into some Lojbanic foreign language, and back (via Lojban); the result is identical to the original text. Thus, we are completely confident in the translation
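The round-trip property described in this abstract can be illustrated with a minimal sketch. It is not the authors' system: the toy vocabulary, the function names `invert` and `substitute`, and the choice of French as the target are all assumptions made for illustration. The sketch only shows why a one-to-one word mapping makes translation reversible.

```python
def invert(mapping):
    """Return the inverse of a word mapping, refusing non-bijective input."""
    inverse = {target: source for source, target in mapping.items()}
    if len(inverse) != len(mapping):
        # Two source words share a target, so the mapping cannot be undone.
        raise ValueError("mapping is not one-to-one")
    return inverse


def substitute(tokens, mapping):
    """Replace each token that has an entry; other tokens pass through unchanged."""
    return [mapping.get(token, token) for token in tokens]


# Hypothetical toy vocabulary (not taken from the paper).
EN_TO_FR = {"dog": "chien", "eats": "mange", "bread": "pain"}
FR_TO_EN = invert(EN_TO_FR)

sentence = ["dog", "eats", "bread"]
forward = substitute(sentence, EN_TO_FR)   # English -> French words
back = substitute(forward, FR_TO_EN)       # French -> English words

# The round trip returns the original token sequence.
assert back == sentence
```

Tokens without an entry, such as structure words or named entities, are passed through unchanged here. That is a simplification: the round trip stays exact only as long as no pass-through token coincides with a word in the target vocabulary.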
Using Database Tables and a Non-standard Neural Network Model, for Internal Cognitive Representations
1. Language as an Information System
One of the primary goals of a human language is to ensure the exchange of information between individuals. Information residing as the internal cognitive representation of an individual H1 is presented as language-coded information, communicated to another individual H2, and interpreted into the internal cognitive representation of H2 (Figure 1). The "internal cognitive representation" is considered to be related to a semantic description of the world. Since the transmission of information relates two internal representations, language communication can be thought of as building a "semantic channel" (Figure 1). To conceive an information system (IS), designers use the representation "input - treatment block - output". On its input, an IS receives data and resources, and on its output it delivers informative products. The treatment block functions on the basis of a particular model, which includes a number of rules and operators on data. Data emerge from a data source that is external to the system. For the correct functioning of the IS, it is essential to guarantee a permanent link between data sources and their data images on the system's input. That requires categorization of data and their storage in separate data containers, in a non-redundant way. IS engineers apply semantic modeling in order to present the data source as a cybernetic system and, on this basis, to build a structure of data containers matching the model of the source. This approach is well known in the IS domain (see, for example, Codd, 1979).
Figure 1 (labels recovered from the diagram): input: resources (cognitive resources), data (coded internal representation of H1); treatment block: language model and method for data processing, rules and operators, treatment uses cognitive resources, categorization of data, organized storage; semantic channel; internal representation of Language Information Syste